Demography module#
Variables#
hsize
hsize codes the size of the household. This should not include individuals who may be living in the household but do not form an economic unit with the other household members (e.g., a live-in maid is not part of the household).
age
age refers to the interval of time between the date of birth and the date of the survey. Every effort should be made to determine the precise and accurate age of each person, particularly of children and older persons. Information on age may be secured either by obtaining the date (year, month, and day) of birth or by asking directly for age at the person’s last birthday. In addition, in the case of children aged less than or equal to 60 months, variable age should be expressed in the number of completed years and months in decimals. For example, If the interview of a 4 years old was in December and he was born in June, his age should be recorded as 4.5. Lastly, if the information on age is not available, it should be coded as missing rather than some other value such as “99” or “999”.
male
male is a dummy variable that specifies the sex – male or female – of an individual within a household. While constructing this variable, it is important to make sure that all relevant values are included. Variable values coded as ‘98’ or other numeric characters should be excluded from the values of the `male’ variable. Sex of household member, two categories after harmonization:
1 = male 0 = female
relationharm
relationharm is a string variable that indicates a relationship to the reference person of household (usually the head of household). Variable values coded as ‘98’ or other numeric characters should be excluded from the values of relationharm variable.
Relationship to head of household, six categories after harmonization:
1=head 2=spouse 3=children 4=parents 5=other relatives 6=non-relatives
Note: In cases where head is missing or a migrant, we assign spouse as the head of the household. If spouse is also not available, then we will use oldest member of the household as the head and recode all the relations to head accordingly.
relationcs
relationcs is a country-specific categorical variable that indicates the relationship to the head of the household. The categories for relationship to the head of the household are defined according to the region or country requirements.
marital
Marital is a categorical variable that refers to the personal status of each individual in relation to the marriage laws or customs of the country. The categories of marital status to be identified should include at least the following: (a) single (in other words, never married); (b) married; (c) married but separated; (d) windowed and remarried; (e) divorced and not remarried.
In some countries, category (b) may require a subcategory of persons who are contractually married but not yet living as man and wife. In all countries, category (c) should comprise both the legally and the de facto separated, who may be shown as separate subcategories if desired. The marital variable should not be imputed but rather calculated only for those to whom the question was asked (in other words, the youngest age at which information is collected may differ depending on the survey).
The consistency between age and marital needs to be cross-checked. In most countries, there are also likely to be persons who were permitted to marry below the legal minimum age because of special circumstances. To permit international comparisons of data on marital status, however, any tabulations of marital status not cross-classified by exact age should at least distinguish between persons under 15 years of age and over. If it is not possible to distinguish between married and living together, then it should be assumed that the individual is married. Variable values coded as ‘98’ or other numeric characters should be excluded from the values of the ‘marital’ variable.
Marital status, five categories after harmonization:
1=married 2=never married 3=living together 4=divorced/separated 5=widowed
eye_dsablty
eye_dsablty is a numerical variable that indicates whether an individual has any difficulty in seeing, even when wearing glasses. Categories after harmonization:
1 = No – no difficulty 2 = Yes – some difficulty 3 = Yes – a lot of difficulty 4 = Cannot do at all
hear_dsablty
hear_dsablty is a numerical variable that indicates whether an individual has any difficulty in hearing even when using a hearing aid. Categories after harmonization:
1 = No – no difficulty 2 = Yes – some difficulty 3 = Yes – a lot of difficulty 4 = Cannot do at all
walk_dsablty
walk_dsablty is a numerical variable that indicates whether an individual has any difficulty in walking or climbing steps. Categories after harmonization:
1 = No – no difficulty 2 = Yes – some difficulty 3 = Yes – a lot of difficulty 4 = Cannot do at all
conc_dsord
conc_dsord is a numerical variable that indicates whether an individual has any difficulty concentrating or remembering. Categories after harmonization:
1 = No – no difficulty 2 = Yes – some difficulty 3 = Yes – a lot of difficulty 4 = Cannot do at all
slfcre_dsablty
slfcre_dsablty is a numerical variable that indicates whether an individual has any difficulty with self-care such as washing all over or dressing. Categories after harmonization:
1 = No – no difficulty 2 = Yes – some difficulty 3 = Yes – a lot of difficulty 4 = Cannot do at all
comm_dsablty
comm_dsablty is a numerical variable that indicates whether an individual has any difficulty communicating or understanding usual (customary) language. Categories after harmonization:
1 = No – no difficulty 2 = Yes – some difficulty 3 = Yes – a lot of difficulty 4 = Cannot do at all
Lessons Learned and Challenges#
Data sets that are harmonized incorrectly can lead to skewed and/or incorrect data analysis. Harmonizers should run a series of checks to ensure data is harmonized properly, including the following:
Check to make sure that age is an integer since 5 years old.
age/int(age)!= 1 & age!= . & age > 5
age cannot have negative or extreme values (>120)
(age < 0 | age>120) & age<.
Age cannot be missing
age==.
Male variable can only take one of two values 0 or 1 (or missing).
male!=. & male!= 1 & male!= 0
Check if male is missing.
male==.
Check to make sure that there is variation in male
egen sdmale = sd(male) // sdmale should be 0
relationharm must be an integer in the range [1,6].
relationharm<1 & relationharm>6 & mod(relationharm, 1) == 1
marital must be an integer in the range [1,5].
marital<0 & marital>5 & mod(marital, 1) == 1
Children are “Never married” and should be coded as so even though it may be perceived as obvious. The marital status of individuals should be harmonized for all individuals. Harmonizers should check to make sure children are not systematically left with missing values for marital.
tab age marital, missing
weight cannot be missing
weight==.
Additionally, harmonizers should ensure that the household size variable is calculated correctly. Not all the individuals reported in a household that form the raw data are current household members. For example, for the EU-SILC survey, a household contains the current member, but also the members of the previous survey who have left the household for reasons such as death or migration.
Overview of Variables#
Module Code |
Variable name |
Variable label |
Notes |
---|---|---|---|
Demography |
hsize |
Household size |
|
Demography |
age |
Age in years |
|
Demography |
male |
Binary - Individual is male |
|
Demography |
relationharm |
Relationship to head of household harmonized across all regions |
GMD - Harmonized categories across all regions. Same as I2D2 categories. |
Demography |
relationcs |
Relationship to head of household country/region specific |
country or regionally specific categories |
Demography |
marital |
Marital status |
|
Demography |
eye_dsablty |
Difficulty seeing |
See “Recommended Short Set of Questions” for all disability questions |
Demography |
hear_dsablty |
Difficulty hearing |
|
Demography |
walk_dsablty |
Difficulty walking / steps |
|
Demography |
conc_dsord |
Difficulty concentrating |
|
Demography |
slfcre_dsablty |
Difficulty w/ selfcare |
|
Demography |
comm_dsablty |
Difficulty communicating |